AITopics | language generation model

language generation model

Language Generation with Strictly Proper Scoring Rules

Shao, Chenze, Meng, Fandong, Liu, Yijin, Zhou, Jie

arXiv.org Artificial Intelligence, May-29-2024

Language generation based on maximum likelihood estimation (MLE) has become the fundamental approach for text generation. Maximum likelihood estimation is typically performed by minimizing the negative log-likelihood loss, which corresponds to the logarithmic score in statistical decision theory. The logarithmic score is strictly proper in the sense that it encourages honest forecasts: the expected score is maximized only when the model reports the true probabilities. Although many strictly proper scoring rules exist, the logarithmic score is the only local scoring rule among them, meaning that it depends exclusively on the probability of the observed sample, which makes it capable of handling the exponentially large sample space of natural text. In this work, we propose a straightforward strategy for adapting scoring rules to language generation, allowing for language modeling with any non-local scoring rule. Leveraging this strategy, we train language generation models using two classic strictly proper scoring rules, the Brier score and the Spherical score, as alternatives to the logarithmic score. Experimental results indicate that simply substituting the loss function, without adjusting other hyperparameters, can yield substantial improvements in the model's generation capabilities. Moreover, these improvements can scale up to large language models (LLMs) such as LLaMA-7B and LLaMA-13B. Source code: https://github.com/shaochenze/ScoringRulesLM.
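
To make the notion of a strictly proper scoring rule concrete, here is a minimal sketch in plain Python. It is not taken from the paper's code; the function names and the three-outcome toy distribution are illustrative assumptions. It uses the standard positively oriented forms of the logarithmic, Brier, and spherical scores and checks numerically that the expected score is highest when the forecast equals the true distribution.

```python
import math

def log_score(probs, observed):
    # Local rule: uses only the probability assigned to the observed outcome.
    return math.log(probs[observed])

def brier_score(probs, observed):
    # Non-local rule: also depends on the probabilities of unobserved outcomes.
    return 2.0 * probs[observed] - sum(q * q for q in probs)

def spherical_score(probs, observed):
    # Non-local rule: observed probability divided by the L2 norm of the forecast.
    return probs[observed] / math.sqrt(sum(q * q for q in probs))

def expected_score(score, truth, forecast):
    # Expectation of the score when outcomes are drawn from `truth`.
    return sum(t * score(forecast, y) for y, t in enumerate(truth))

# Hypothetical toy distribution over a three-outcome "vocabulary".
truth = [0.7, 0.2, 0.1]
overconfident = [0.9, 0.05, 0.05]

for score in (log_score, brier_score, spherical_score):
    honest = expected_score(score, truth, truth)
    dishonest = expected_score(score, truth, overconfident)
    print(score.__name__, "honest > dishonest:", honest > dishonest)
```

The Brier and spherical scores sum over every outcome, which is why they are non-local and cannot be evaluated directly over the space of all possible texts. The abstract does not spell out the adaptation strategy, so this sketch only covers a small categorical distribution.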

computational linguistic, language generation, logarithmic score, (13 more...)

arXiv:2405.18906

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Berlin (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Knowledge Graph-Augmented Korean Generative Commonsense Reasoning

Jung, Dahyun, Seo, Jaehyung, Lee, Jaewook, Park, Chanjun, Lim, Heuiseok

arXiv.org Artificial Intelligence, Jun-26-2023

Generative commonsense reasoning refers to the task of generating acceptable and logical assumptions about everyday situations based on commonsense understanding. By utilizing an existing dataset such as Korean CommonGen, language generation models can learn commonsense reasoning specific to the Korean language. However, language models often fail to consider the relationships between concepts and the deep knowledge inherent to concepts. To address these limitations, we propose a method that utilizes Korean knowledge graph data for text generation. Our experimental results show that the proposed method can enhance the efficiency of Korean commonsense inference, thereby underlining the significance of employing supplementary data.

artificial intelligence, commonsense reasoning, natural language, (12 more...)

arXiv:2306.1447

Country:

North America > United States > Washington > King County > Seattle (0.05)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.05)
North America > United States > Michigan (0.05)
(3 more...)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.69)

EvoText: Enhancing Natural Language Generation Models via Self-Escalation Learning for Up-to-Date Knowledge and Improved Performance

Yuan, Zhengqing, Xue, Huiwen, Zhang, Chao, Liu, Yongming

arXiv.org Artificial Intelligence, Apr-13-2023

In recent years, pretrained models have been widely used in various fields, including natural language understanding, computer vision, and natural language generation. However, the performance of these language generation models is highly dependent on the model size and the dataset size. While larger models excel in some aspects, they cannot learn up-to-date knowledge and are relatively difficult to relearn. In this paper, we introduce EvoText, a novel training method that enhances the performance of any natural language generation model without requiring additional datasets during the entire training process (although a prior dataset is necessary for pretraining). EvoText employs two models: $G$, a text generation model, and $D$, a model that can determine whether the data generated by $G$ is legitimate. Initially, the fine-tuned $D$ model serves as the knowledge base. The text generated by $G$ is then input to $D$ to determine whether it is legitimate. Finally, $G$ is fine-tuned based on $D$'s output. EvoText enables the model to learn up-to-date knowledge through a self-escalation process that builds on a priori knowledge. When EvoText needs to learn something new, it simply fine-tunes the $D$ model. Our approach applies to autoregressive language modeling for all Transformer classes. With EvoText, eight models achieved stable improvements in seven natural language processing tasks without any changes to the model structure.
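
The generate, judge, fine-tune cycle described above can be sketched with toy stand-ins. Everything here is a hypothetical illustration rather than the EvoText implementation: `ToyGenerator`, `ToyDiscriminator`, and `self_escalation_round` are invented names, the "models" are lookup tables, and the sketch assumes that "fine-tuned based on $D$'s output" means training $G$ only on the samples $D$ accepts, which the abstract does not confirm.

```python
import random

class ToyGenerator:
    """Stand-in for G: samples sentences in proportion to learned weights."""

    def __init__(self, sentences, seed=0):
        self.weights = {s: 1.0 for s in sentences}
        self.rng = random.Random(seed)

    def generate(self):
        sentences = list(self.weights)
        weights = [self.weights[s] for s in sentences]
        return self.rng.choices(sentences, weights=weights)[0]

    def fine_tune(self, accepted):
        # Assumed update rule: reinforce every sample that D accepted.
        for s in accepted:
            self.weights[s] += 1.0

class ToyDiscriminator:
    """Stand-in for D: acts as the knowledge base of legitimate text."""

    def __init__(self, legitimate):
        self.legitimate = set(legitimate)

    def is_legitimate(self, text):
        return text in self.legitimate

    def update(self, new_facts):
        # Learning something new only touches D, as the abstract describes.
        self.legitimate |= set(new_facts)

def self_escalation_round(g, d, num_samples=20):
    # 1) G generates text, 2) D judges it, 3) G is tuned on D's verdicts.
    samples = [g.generate() for _ in range(num_samples)]
    accepted = [s for s in samples if d.is_legitimate(s)]
    g.fine_tune(accepted)
    return accepted

g = ToyGenerator(["old fact", "outdated claim", "new fact"])
d = ToyDiscriminator(["old fact"])
self_escalation_round(g, d)
d.update(["new fact"])          # inject up-to-date knowledge through D only
self_escalation_round(g, d)
print(g.weights)
```

In this toy version the weight of "outdated claim" never grows, because D never accepts it, while "new fact" can only start gaining weight after D has been updated.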

artificial intelligence, machine learning, natural language, (17 more...)

doi: 10.3390/app13084758

arXiv:2302.03896

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Education (0.93)
Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Words and images

#artificialintelligence, Apr-6-2021, 20:05:03 GMT

As we rely more on natural language processing to help us navigate our world, it's more important than ever that these artificial intelligence models -- used increasingly in applications such as caption generation for the visually impaired -- remain true to reality. "The issue is that deep learning-based neural language generation models have no guarantees in generating factually correct sentences that are faithful to the input data," said UC Santa Barbara computer scientist William Wang. Over the many iterations it takes for a language generation model to learn how to describe or predict what a scene depicts, elements can creep in, causing phenomena such as errors in data-to-text translations or object hallucinations, in which the caption contains an object or an action that doesn't exist in the image. As a result, unless you have a way of reining in these errors (or you're surrealist painter René Magritte) these mismatches could spell the end of the usefulness of the language generation model being used. "This is a huge problem," said Wang. "Imagine you are using a news summarization system to read earnings reports -- the loss of faithfulness can give you wrong numbers, wrong facts and misinformation. Similarly, if a visually impaired person relies on an image captioning system to see the environment, wrong generation could create serious consequences."

language generation model, natural language processing, wang, (8 more...)

Country: North America > United States (0.16)

Genre: Personal (0.36)

Industry:

Media > News (0.36)
Education > Educational Setting > K-12 Education (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)

Researchers release dataset to expose racial, religious, and gender biases in language models

#artificialintelligence, Feb-6-2021, 04:13:24 GMT

Natural language models are the building blocks of apps including machine translators, text summarizers, chatbots, and writing assistants. But there's growing evidence showing that these models risk reinforcing undesirable stereotypes, mostly because a portion of the training data is commonly sourced from communities with gender, race, and religious prejudices. For example, OpenAI's GPT-3 places words like "naughty" or "sucked" near female pronouns and "Islam" near words like "terrorism." A new study from researchers affiliated with Amazon and the University of California, Santa Barbara aims to shed light specifically on biases in open-ended English natural language generation. The researchers created what they claim is the largest benchmark dataset of its kind containing 23,679 prompts, 5 domains, and 43 subgroups extracted from Wikipedia articles.

gender bias, language model, researcher release dataset, (13 more...)

Country: North America > United States > California > Santa Barbara County > Santa Barbara (0.26)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.60)

Automatic Conditional Generation of Personalized Social Media Short Texts

Wang, Ziwen, Wang, Jie, Gu, Haiqian, Su, Fei, Zhuang, Bojin

arXiv.org Artificial Intelligence, Jun-15-2019

Automatic text generation has received much attention owing to the rapid development of deep neural networks. In general, text generation systems based on statistical language models do not consider anthropomorphic characteristics, which results in machine-like generated texts. To fill the gap, we propose a conditional language generation model that takes Big Five Personality (BFP) feature vectors as input context and writes human-like short texts. The short text generator consists of a long short-term memory (LSTM) layer, where a BFP feature vector is concatenated as one part of the input for each cell. To enable supervised training of the generation model, a text classification model based on a convolutional neural network (CNN) has been used to prepare BFP-tagged Chinese micro-blog corpora. Validated by a BFP linguistic computational model, our generated Chinese short texts exhibit discriminative personality styles and are also syntactically correct and semantically smooth, with appropriate emoticons. By combining natural language generation with psychological linguistics, our proposed BFP-dependent text generation model can be widely used for individualization in machine translation, image captioning, dialogue generation and so on.
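
The conditioning step, in which the same BFP vector is concatenated to the input of every LSTM cell, can be illustrated in plain Python. This is only a sketch of the input construction under stated assumptions: `concat_personality` is a hypothetical helper, the embeddings are made-up numbers, and the trait order (openness, conscientiousness, extraversion, agreeableness, neuroticism) is an arbitrary choice that the abstract does not specify.

```python
def concat_personality(token_embeddings, bfp_vector):
    """Append the same five-dimensional BFP vector to every time step's input."""
    if len(bfp_vector) != 5:
        raise ValueError("a Big Five Personality vector has exactly five traits")
    return [list(embedding) + list(bfp_vector) for embedding in token_embeddings]

# Hypothetical two-token sequence with 3-dimensional embeddings.
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
# Assumed order: openness, conscientiousness, extraversion, agreeableness, neuroticism.
bfp = [0.8, 0.3, 0.9, 0.6, 0.2]

lstm_inputs = concat_personality(embeddings, bfp)
print(len(lstm_inputs), len(lstm_inputs[0]))
```

Each resulting row would be fed to the LSTM at its time step, so the recurrent layer's input size becomes the embedding size plus five.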

artificial intelligence, machine learning, natural language, (18 more...)

doi: 10.1007/978-3-319-97310-4_7

arXiv:1906.09324

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
